47 research outputs found

    Replacing 6T SRAMs with 3T1D DRAMs in the L1 data cache to combat process variability

    With continued technology scaling, process variations will be especially detrimental to six-transistor static memory structures (6T SRAMs). A memory architecture using three-transistor, one-diode DRAM (3T1D) cells in the L1 data cache tolerates wide process variations with little performance degradation, making it a promising choice for on-chip cache structures for next-generation microprocessors. Peer reviewed. Postprint (published version).

    RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN

    Second-order training methods can converge much faster than first-order optimizers in DNN training, because they use the inverse of the second-order information (SOI) matrix to find a more accurate descent direction and step size. However, the huge SOI matrices bring significant computational and memory overheads on traditional architectures such as GPUs and CPUs. ReRAM-based process-in-memory (PIM) technology, by contrast, suits second-order training for three reasons: first, PIM's computation happens in memory, which reduces data movement overheads; second, ReRAM crossbars can compute SOI's inversion in O(1) time; third, if architected properly, ReRAM crossbars can perform matrix inversion and vector-matrix multiplications, which are important to second-order training algorithms. Nevertheless, current ReRAM-based PIM techniques still face a key challenge in accelerating second-order training: the existing ReRAM-based matrix inversion circuitry supports only 8-bit-accurate matrix inversion, whereas second-order training needs at least 16-bit-accurate matrix inversion. In this work, we propose a method to achieve high-precision matrix inversion based on proven 8-bit matrix inversion (INV) circuitry and vector-matrix multiplication (VMM) circuitry. We design RePAST, a ReRAM-based PIM accelerator architecture for second-order training. Moreover, we propose a software mapping scheme for RePAST to further optimize performance by fusing VMM and INV crossbars. Experiments show that RePAST achieves an average 115.8×/11.4× speedup and 41.9×/12.8× energy saving compared to a GPU counterpart and PipeLayer on large-scale DNNs. Comment: 13 pages, 13 figures
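
The abstract above does not say how the 8-bit INV and VMM circuitry are combined to reach higher precision. As a rough software illustration only, the sketch below uses classical iterative refinement, a stand-in technique and not necessarily the RePAST method: an inverse quantized to 8 bits gives a coarse solution, and repeated matrix-vector products correct it. All names here (`quantize`, `refine_solve`, the example matrix `A`) are hypothetical.

```python
# Hypothetical sketch: classical iterative refinement in software.
# This is an assumed stand-in, NOT the circuit described in the abstract.

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def invert(A):
    """Full-precision inverse via Gauss-Jordan with partial pivoting."""
    n = len(A)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def quantize(M, bits=8):
    """Round every entry to a signed fixed-point grid with `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for row in M for v in row)
    return [[round(v / scale * levels) / levels * scale for v in row]
            for row in M]

def residual_norm(A, x, b):
    """Largest absolute entry of b - A x."""
    return max(abs(bi - ri) for bi, ri in zip(b, matvec(A, x)))

def refine_solve(A, M_low, b, steps=20):
    """Solve A x = b using only the low-precision inverse M_low."""
    x = matvec(M_low, b)                      # coarse first guess
    for _ in range(steps):
        r = [bi - ri for bi, ri in zip(b, matvec(A, x))]  # residual
        dx = matvec(M_low, r)                 # low-precision correction
        x = [xi + di for xi, di in zip(x, dx)]
    return x

# Small, well-conditioned example matrix (an assumption for illustration).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
M_low = quantize(invert(A), bits=8)
x_once = matvec(M_low, b)
x_refined = refine_solve(A, M_low, b)
print(residual_norm(A, x_once, b), residual_norm(A, x_refined, b))
```

The refinement loop converges only when the quantized inverse is close enough to the true one (the norm of I - M_low A below 1), which holds for this small example but is not guaranteed for ill-conditioned matrices.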

    A Health Monitoring System Based on Flexible Triboelectric Sensors for Intelligence Medical Internet of Things and its Applications in Virtual Reality

    The Internet of Medical Things (IoMT) is a platform that combines Internet of Things (IoT) technology with medical applications, enabling precision medicine, intelligent healthcare, and telemedicine in the era of digitalization and intelligence. However, the IoMT faces various challenges, including sustainable power supply, the human adaptability of sensors, and the intelligence of sensors. In this study, we designed a robust and intelligent IoMT system through the synergistic integration of flexible wearable triboelectric sensors and deep learning-assisted data analytics. We embedded four triboelectric sensors into a wristband to detect and analyze limb movements in patients suffering from Parkinson's Disease (PD). By further integrating deep learning-assisted data analytics, we realized an intelligent healthcare monitoring system for the surveillance of and interaction with PD patients, which includes location/trajectory tracking, heart monitoring, and identity recognition. This approach enabled us to accurately capture and scrutinize the subtle movements and fine motor activity of PD patients, thus providing insightful feedback and a comprehensive assessment of the patients' conditions. The monitoring system is cost-effective, easily fabricated, highly sensitive, and intelligent, and it consequently underscores the immense potential of human body sensing technology in a Health 4.0 society.

    CeO2 Nanowires Inserted into Reduced Graphene Oxide as Active Electrocatalyst for Oxygen Reduction Reaction

    Fabrication of an interconnected and conductive nano-architecture is a prospective strategy for designing a high-performance and low-cost electrocatalyst for the oxygen reduction reaction (ORR). Herein, a novel nano-architecture with a hierarchical structure, assembled from graphene nanosheets and CeO2 nanowires (NWs), was developed by a facile hydrothermal process using ethanol/water as solvents without any organic additives. In this framework, graphene oxide (GO) was reduced to graphene, and chemical bonding formed between the GO and the CeO2 NWs during the hydrothermal process. The embedded CeO2 NWs could prevent the restacking of the graphene sheets and improved the electrical conductivity of the hybrid catalyst. The effect of different ratios of GO to CeO2 NWs in the hybrid was studied. The GO3-CeO2 NWs composite exhibited better catalytic performance, with slow attenuation and a limiting current density 3.55 and 1.99 times higher than those of CeO2 NWs and pure GO, respectively. The onset potential of GO3-CeO2 NWs is shifted positively by 0.13 V and 0.05 V relative to those of CeO2 NWs and pure GO, respectively, suggesting that the GO3-CeO2 NWs hybrid had excellent stability and activity for ORR. It was found that CeO2 NWs served not only as an effective catalyst but also as an “oxygen buffer” to relieve oxygen insufficiency for ORR.